Finding Groups in Large Data Sets
Abstract
This paper gives an overview of methods for finding groups in large data sets, such as household expenditure survey data. The methods fall into three categories: cluster analysis, dimension reduction, and basic exploratory methods. The emphasis is on a critical analysis of their potential drawbacks, especially those arising from inputs that the researcher must provide. Such inputs may impose structure that is not present in the data, defeating the purpose of revealing intrinsic patterns. In general, the more elaborate methods, such as cluster analysis, are delicate to apply, especially in the social sciences. Often it may be best to limit oneself to more transparent approaches, such as comparisons of basic statistics.
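The concern about researcher-supplied inputs can be illustrated with a small sketch. It is not taken from the paper: it assumes k-means as the clustering method, uses synthetic uniform noise instead of survey data, and the function name `kmeans` and all parameter values are hypothetical choices made for this illustration.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: (x - centroids[j][0]) ** 2 + (y - centroids[j][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels

# Synthetic data with no group structure (assumption): uniform noise on the unit square.
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(300)]

# The number of groups k is an input chosen by the researcher, not found in the data.
for k in (2, 3, 5):
    labels = kmeans(points, k)
    sizes = [labels.count(j) for j in range(k)]
    print(f"k={k}: cluster sizes {sizes}")
```

Whatever value of k is passed in, the procedure returns a partition of the points, even though the data were generated without any groups. The output alone therefore says nothing about whether the groups are real, which is one reason to cross-check against simpler summaries.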
Similar resources
Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a huge increase in the number of frauds arising from abnormal user behavior is resulting in billions of dollars in losses around the world. T...
Application of Benford’s Law in Analyzing Geotechnical Data
Benford’s law predicts the frequency of the first digit of numbers met in a wide range of naturally occurring phenomena. In data sets that follow Benford’s law, numbers start with a small leading digit more often than with a large one. The law can be used as a tool for detecting fraud and abnormalities in number sets, including fabricated ones. This can be used as an ...
Solubility Prediction of Drugs in Supercritical Carbon Dioxide Using Artificial Neural Network
The descriptors computed by HyperChem® software were employed to represent the solubility of 40 drug molecules in supercritical carbon dioxide using an artificial neural network with a 15-4-1 architecture. The accuracy of the proposed method was evaluated by computing the average absolute error (AE) between the calculated and experimental logarithms of solubility. The AE (±SD) of data sets was 0.4 ...
Publication date: 2002